Outline


background

topic modelling

revtools

alternatives


L Bornmann & R Mutz (2015) Growth rates of modern science:
A bibliometric analysis based on the number of publications and cited references.
JASIST 66(11): 2215-2222

NR Haddaway & MJ Westgate (in review)
A tool for predicting the time taken to conduct an environmental systematic review



Wouldn’t it be great
to see patterns in search data,
rather than read lists of text?

topic modelling

  • relatively old method for text mining (2003)
  • widely used (22,458 citations)
  • assumes that documents contain >1 ‘topics’ in different proportions
  • number of topics is user-specified
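
The assumptions in the list above can be sketched as a small simulation. This is an illustration of the generative idea only, not a fitted model and not the code behind these slides; the vocabulary, the counts and the helper names (rdirichlet_one, generate_doc) are all hypothetical.

```r
# Illustrative sketch of the generative assumption behind topic models.
# All names and values are hypothetical, chosen only for illustration.
set.seed(42)

n_topics <- 3   # number of topics: specified by the user, not learned
n_docs   <- 5
vocab    <- c("fire", "frog", "habitat", "review", "bias", "evidence")

# Draw one Dirichlet sample by normalising independent gamma draws
rdirichlet_one <- function(alpha) {
  g <- rgamma(length(alpha), shape = alpha)
  g / sum(g)
}

# Each document is a mixture of topics (each row sums to 1)
doc_topics <- t(replicate(n_docs, rdirichlet_one(rep(0.5, n_topics))))

# Each topic is a distribution over the vocabulary (each row sums to 1)
topic_words <- t(replicate(n_topics, rdirichlet_one(rep(0.5, length(vocab)))))
colnames(topic_words) <- vocab

# Generate a document: pick a topic for each word, then a word from that topic
generate_doc <- function(theta, n_words = 20) {
  z <- sample(n_topics, n_words, replace = TRUE, prob = theta)
  vapply(z, function(k) sample(vocab, 1, prob = topic_words[k, ]), character(1))
}

docs <- lapply(seq_len(n_docs), function(d) generate_doc(doc_topics[d, ]))
round(doc_topics, 2)  # topic proportions for each document
```

Fitting a topic model runs this process in reverse: given only the words in each document, it estimates doc_topics and topic_words.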

topic models in R

  • standardised import of multiple formats
  • flexible de-duplication
  • interactive visualisation and article selection
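
To make the de-duplication idea concrete, here is a minimal base R sketch that flags records whose titles match after normalisation. It is an assumption-laden illustration, not the algorithm revtools uses; the second record is an invented variant of a real title, and normalise_title is a hypothetical helper.

```r
# Minimal de-duplication sketch in base R; NOT revtools' own algorithm.
# The second record is an invented near-duplicate for illustration.
refs <- data.frame(
  title = c("The difficulties of systematic reviews",
            "The Difficulties of Systematic Reviews.",
            "Software support for environmental evidence synthesis"),
  doi = c("10.1111/cobi.12890", NA, "10.1038/s41559-018-0502-x"),
  stringsAsFactors = FALSE
)

# Normalise titles: lower case, drop punctuation, collapse whitespace
normalise_title <- function(x) {
  x <- tolower(x)
  x <- gsub("[[:punct:]]", "", x)
  trimws(gsub("\\s+", " ", x))
}

refs$key <- normalise_title(refs$title)
is_dup <- duplicated(refs$key)    # TRUE for second and later copies
unique_refs <- refs[!is_dup, ]
nrow(unique_refs)                 # 2 unique records remain
```

Exact matching on a normalised key misses typos and truncated titles, which is why a flexible approach (fuzzy matching, or matching on several fields) is useful in practice.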

https://revtools.net

Import and de-duplication

library(revtools) # load
data <- read_bibliography("data/Westgate_scopus_2018_04.ris") # import
summary(data)
## Object of class 'bibliography' containing 25 entries.
##   Number containing abstracts: 25 (100%)
## Number of sources: 16
## Most common sources:
##   Biological Conservation (n = 3)
##   Conservation Biology (n = 3)
##   PLoS ONE (n = 3)
##   Ecography (n = 2)
##   Journal of Applied Ecology (n = 2)
print(data[1])
## M.J. Westgate, N.R. Haddaway et al. (2018) Software support for environmental evidence synthesis. Nature Ecology and Evolution 2: 588-590

Import and de-duplication

data <- read_bibliography("data/Westgate_scopus_2018_04.ris") # import
str(as.data.frame(data)) # convert to a data.frame and inspect its structure
## 'data.frame':    25 obs. of  14 variables:
##  $ label   : chr  "Westgate_2018_NatEcoandEvo" "Westgate_2018_DivandDis" "Westgate_2017_ConsBiol" "Westgate_2017_Ecog" ...
##  $ type    : chr  "JOUR" "JOUR" "JOUR" "JOUR" ...
##  $ author  : chr  "Westgate, M.J. and Haddaway, N.R. and Cheng, S.H. and McIntosh, E.J. and Marshall, C. and Lindenmayer, D.B." "Westgate, M.J. and MacGregor, C. and Scheele, B.C. and Driscoll, D.A. and Lindenmayer, D.B." "Westgate, M.J. and Lindenmayer, D.B." "Westgate, M.J. and Tulloch, A.I.T. and Barton, P.S. and Pierson, J.C. and Lindenmayer, D.B." ...
##  $ year    : chr  "2018" "2018" "2017" "2017" ...
##  $ title   : chr  "Software support for environmental evidence synthesis" "Effects of time since fire on frog occurrence are altered by isolation, vegetation and fire frequency gradients" "The difficulties of systematic reviews" "Optimal taxonomic groups for biodiversity assessment a meta-analytic approach" ...
##  $ journal : chr  "Nature Ecology and Evolution" "Diversity and Distributions" "Conservation Biology" "Ecography" ...
##  $ volume  : chr  "2" "24" "31" "40" ...
##  $ number  : chr  "4" "1" "5" "4" ...
##  $ pages   : chr  "588-590" "82-91" "1002-1007" "539-548" ...
##  $ abstract: chr  "Evidence-based environmental management is being hindered by difficulties in locating, interpreting and synthes"| __truncated__ "Aim To quantify how frogs in terrestrial environments respond to recurrent fire, and to what extent this is med"| __truncated__ "The need for robust evidence to support conservation actions has driven the adoption of systematic approaches t"| __truncated__ "A fundamental decision in biodiversity assessment is the selection of one or more study taxa, a choice that is "| __truncated__ ...
##  $ doi     : chr  "10.1038/s41559-018-0502-x" "10.1111/ddi.12659" "10.1111/cobi.12890" "10.1111/ecog.02318" ...
##  $ url     : chr  "https//www.scopus.com/inward/record.uri?eid=2-s2.0-85044402000&doi=10.1038%2fs41559-018-0502-x&partnerID=40&md5"| __truncated__ "https//www.scopus.com/inward/record.uri?eid=2-s2.0-85037556950&doi=10.1111%2fddi.12659&partnerID=40&md5=bd03c28"| __truncated__ "https//www.scopus.com/inward/record.uri?eid=2-s2.0-85021013390&doi=10.1111%2fcobi.12890&partnerID=40&md5=131a71"| __truncated__ "https//www.scopus.com/inward/record.uri?eid=2-s2.0-84963987432&doi=10.1111%2fecog.02318&partnerID=40&md5=ace38b"| __truncated__ ...
##  $ address : chr  "Fenner School of Environment and Society, Australian National University, A-Acton, ACT, Australia and Mistra Ev"| __truncated__ "Fenner School of Environment and Society, The Australian National University, Canberra, ACT, Australia and Long"| __truncated__ "Fenner School of Environment and Society, The Australian National University, Canberra, ACT, Australia and ARC "| __truncated__ "The Fenner School of Environment and Society, The Australian National Univ., Canberra, ACT, Australia" ...
##  $ keywords: chr  NA "amphibians and disturbance and fire regime and pyrodiversity" "anlisis de texto and bias and margen de error and meta-analysis and meta-anlisis and sinonimia and synonymy and"| __truncated__ NA ...
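
Once the records are in a data.frame, ordinary base R applies. As a small sketch, the author column shown above separates names with " and ", so it can be split into one vector per article. The two strings below are copied from the output above; with real data they would come from the author column of the converted data.frame.

```r
# Author strings copied from the str() output above (first and third records)
author <- c(
  "Westgate, M.J. and Haddaway, N.R. and Cheng, S.H. and McIntosh, E.J. and Marshall, C. and Lindenmayer, D.B.",
  "Westgate, M.J. and Lindenmayer, D.B."
)

# Split each string on the literal separator " and "
author_list <- strsplit(author, " and ", fixed = TRUE)

n_authors    <- lengths(author_list)                        # authors per article
first_author <- vapply(author_list, `[`, character(1), 1)   # first name in each list
```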



Total search results: 31,369 - Estimated number of unique articles: 18,433 - Number of pairwise links: 22,311

MJ Westgate, PS Barton, JC Pierson & DB Lindenmayer (2015) Text analysis tools for identification of emerging topics and research gaps in conservation science. Conservation Biology 29(6): 1606-1614

revtools is:

  • primarily a visualisation tool, with screening capacity
  • command-line only for now; a standalone or web-based version is a possible future option
  • no dynamic article recommendation (but stick around for colandr)
  • part of the R community, building on the benefits of that ecosystem

other options

  • SR Toolbox: searchable repo for software options
  • metagear: R package for (manual) SR & meta-analysis support
  • colandr: machine learning support for article screening
  • robotreviewer: text mining, figure extraction


~ special thanks to ~
Ecological Society of Australia

Fenner School of Environment and Society
Australian National University

ARC Centre of Excellence for Environmental Decisions

~ further information ~
website: https://revtools.net
download: https://CRAN.R-project.org/package=revtools
source: https://github.com/mjwestgate/revtools

~ this presentation was created using R ~
rendering: rmarkdown
analysis: revtools | tm | topicmodels | ade4
visualisation: plotly | ggridges | viridis

~ contact ~
martin.westgate@anu.edu.au | @westgatecology